Approximate Dynamic Programming based on Projection onto the (min, +) subsemimodule
Authors
Abstract
We develop a new Approximate Dynamic Programming (ADP) method for infinite-horizon discounted-reward Markov Decision Processes (MDPs) based on projection onto a subsemimodule. We approximate the value function as a (min,+) linear combination of a set of basis functions whose (min,+) linear span constitutes a subsemimodule. The projection operator is closely related to the Fenchel transform. Our approximate solution obeys the (min,+) Projected Bellman Equation (MPPBE), which differs from the conventional Projected Bellman Equation (PBE). We show that the approximation error is bounded in the L∞-norm. We develop a Min-Plus Approximate Dynamic Programming (MPADP) algorithm to compute the solution to the MPPBE. We also prove convergence of the MPADP algorithm and apply it to two problems: a grid-world problem in the discrete domain and the mountain car problem in the continuous domain.
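To make the (min,+) linear combination concrete, here is a minimal Python sketch for a finite state space. The abstract does not spell out the projection operator, so the code assumes one common choice from (min,+) algebra: the smallest element of the span that lies pointwise above the target function. The names `minplus_combine` and `minplus_project`, the basis matrix `Phi`, and the example numbers are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def minplus_combine(Phi, w):
    """(min,+) linear combination: v(s) = min_i (Phi[s, i] + w[i]).

    Phi has shape (n_states, n_basis); w has shape (n_basis,).
    """
    return np.min(Phi + w[None, :], axis=1)


def minplus_project(Phi, u):
    """Assumed projection: smallest v in the (min,+) span with v >= u.

    v >= u holds iff Phi[:, i] + w[i] >= u for every i, so the smallest
    admissible weight is w[i] = max_s (u[s] - Phi[s, i]).
    """
    w = np.max(u[:, None] - Phi, axis=0)
    return w, minplus_combine(Phi, w)


# Hypothetical example: three states, two basis functions.
Phi_demo = np.array([[0.0, 2.0],
                     [1.0, 1.0],
                     [2.0, 0.0]])
u_demo = np.array([0.5, 0.0, 0.5])
w_demo, v_demo = minplus_project(Phi_demo, u_demo)
print("weights:", w_demo)
print("projection:", v_demo)
```

Under this assumption the projection always dominates the target pointwise and is idempotent: projecting a function that is already in the span returns it unchanged.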
Similar resources
On Sequential Optimality Conditions without Constraint Qualifications for Nonlinear Programming with Nonsmooth Convex Objective Functions
Sequential optimality conditions provide adequate theoretical tools to justify stopping criteria for nonlinear programming solvers. Here, nonsmooth approximate gradient projection and complementary approximate Karush-Kuhn-Tucker conditions are presented. These sequential optimality conditions are satisfied by local minimizers of optimization problems independently of the fulfillment of constrai...
Approximate Incremental Dynamic Analysis Using Reduction of Ground Motion Records
Incremental dynamic analysis (IDA) requires the analysis of the non-linear response history of a structure for an ensemble of ground motions, each scaled to multiple levels of intensity and selected to cover the entire range of structural response. Recognizing that IDA of practical structures is computationally demanding, an approximate procedure based on the reduction of the number of ground m...
OPTIMIZATION OF A PRODUCTION LOT SIZING PROBLEM WITH QUANTITY DISCOUNT
The dynamic lot sizing problem is one of the significant problems in industrial units and has been considered by many researchers. Quantity discounts in purchasing cost are an important and practical assumption in inventory control models, yet they have received less attention in the stochastic version of the dynamic lot sizing problem. In this paper, stochastic dyn...
Dynamic anomaly detection by using incremental approximate PCA in AODV-based MANETs
Compared with other networks, Mobile Ad-hoc Networks (MANETs) are more vulnerable because of inherent properties such as dynamic topology and lack of infrastructure. A considerable challenge for these networks is therefore to develop a method that can identify anomalies with high accuracy as the network topology changes. In this paper, two methods are proposed for dynamic anom...
Solution of Large Systems of Equations Using Approximate Dynamic Programming Methods
We consider fixed point equations and the approximation of their solution by projection onto a low-dimensional subspace. We propose stochastic iterative algorithms, based on simulation, which converge to the approximate solution and are suitable for large-dimensional problems. We focus primarily on general linear systems and propose extensions of recent approximate dynamic programming methods...
Journal title:
- CoRR
Volume: abs/1403.4175, Issue: -
Pages: -
Publication date: 2014